Linearity and Scaling of a Statistical Model for the Species Abundance Distribution 

We derive a linear recursion relation for the species abundance distribution in a statistical model 
of ecology and demonstrate the existence of a scaling solution. 

I. INTRODUCTION 

Understanding the relationship between species rich- 
ness in a biome and its corresponding area is a long- 
standing problem in ecology, providing important infor- 
mation about species richness, extinction of species due 
to habitat loss and the design of reserves |Q] . 

Among the most commonly cited mathematical functions relating the number of different species ($S$) and the area they occupy ($A$) is the power-law form of the species-area relationship (SAR): $S = cA^z$. In a paper by Harte et al. this result was shown to be equivalent to assuming self-similarity in the distribution of species. Furthermore, the species-abundance distribution $P_0(n)$, the fraction of species with $n$ individuals, was found to satisfy a nonlinear recursion relation. 

Banavar et al. went on to show that this model exhibits 
scaling data collapse in the same way as observed in the 
two dimensional XY model and in the power fluctuations 
in a closed turbulent flow [9, a result that follows from 
hyperscaling 0). 

The purpose of this paper is to show that the nonlinear 
recursion relation can be recast as a linear recursion rela- 
tion for the species-abundance distribution that is much 
easier to handle; indeed, since the equation governs a 
probability distribution, it is natural to expect that a linear 
equation is obeyed. By means of this recursion relation 
we derive the scaling function assumed by Banavar et al. 



II. THE MODEL AND THE NONLINEAR 
RECURSION RELATION 

In the model proposed by Harte et al., an area $A_0$ containing $S_0$ species is considered. The number of individuals in each species is described by $P_0(n)$, where $S_0 P_0(n)$ is the expected number of species with $n$ individuals. The area $A_0$ is chosen to be a rectangle whose length is $\sqrt{2}$ times its width, so that a bisection along the longer dimension divides it into two rectangles of the same shape as the original (see figure 1). $A_i = A_0/2^i$ is the area of a rectangle after the $i$th bisection. If a species is present in an area $A_i$, and nothing else is known about the species, there are three possibilities: it might be present only in the right sub-partition of area $A_{i+1}$ (probability $P(R'|L)$), only in the left one ($P(R|L')$), or in both ($P(R'|L')$). By symmetry $P(R'|L) = P(R|L')$, and $a$ is defined by $P(R'|L) = 1 - a$. The probability of finding a species on the right side, independently of what happens on the left side, is: 



$$P(R') = P(R'|L) + P(R'|L') = (1 - a) + (2a - 1) = a = P(L') \ \text{(by symmetry)} \tag{1}$$



Self-similarity is introduced by requiring that $a$ be independent of $i$, that is, of scale. 

Two conclusions can be derived from this: a species-area relationship of the form $S = cA^z$ with $a = 2^{-z}$, and a recursion relation for $P_i(n)$ (the expected fraction of species with $n$ individuals in an area $A_i$; see figure 1): 

$$P_i(n) = x P_{i+1}(n) + (1 - x) \sum_{k=1}^{n-1} P_{i+1}(n - k)\, P_{i+1}(k) \tag{2}$$

where $x = 2(1 - a)$. This recursion relation requires an initial condition, which is supplied by defining a minimum patch $A_m = A_0/2^m$ that contains on average only one individual (see figure 2). Consequently, $P_m(n) = \delta_{n1}$. This also limits the maximum number of individuals that can be found in a patch $A_i$ to $2^{m-i}$, so $P_i(n) = 0$ for $n > 2^{m-i}$. 
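
The recursion of Equation (2) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `nonlinear_step` and `abundance_nonlinear` are assumptions made here, not part of the original work, and the distributions are stored as plain lists indexed by the number of individuals $n$ (index 0 is unused).

```python
def nonlinear_step(p_next, x):
    """Apply Eq. (2) once: map P_{i+1} to P_i.

    p_next[n] holds P_{i+1}(n); index 0 is unused and stays 0.
    """
    n_next = len(p_next) - 1          # maximum abundance in the smaller patch
    p = [0.0] * (2 * n_next + 1)      # the maximum abundance doubles per level
    for n in range(1, 2 * n_next + 1):
        # Species present in only one half (total probability x).
        one_side = p_next[n] if n <= n_next else 0.0
        # Species present in both halves: k individuals in one, n - k in the other.
        both_sides = sum(
            p_next[k] * p_next[n - k]
            for k in range(max(1, n - n_next), min(n - 1, n_next) + 1)
        )
        p[n] = x * one_side + (1.0 - x) * both_sides
    return p


def abundance_nonlinear(i, m, z):
    """Return P_i(n) as a list, starting from P_m(n) = delta_{n,1}."""
    a = 2.0 ** (-z)                   # a = 2^{-z}
    x = 2.0 * (1.0 - a)               # x = 2(1 - a)
    p = [0.0, 1.0]                    # minimum patch: exactly one individual
    for _ in range(m - i):
        p = nonlinear_step(p, x)
    return p


if __name__ == "__main__":
    p0 = abundance_nonlinear(i=0, m=4, z=0.5)
    print("maximum abundance:", len(p0) - 1)
    print("normalization:", sum(p0))
```

Each step preserves normalization, since $x + (1 - x) = 1$, so the sum of the returned list is a quick sanity check for any $0 \le z \le 1$.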



III. THE LINEAR RELATION 

Equation (2) is nonlinear and difficult to handle analytically. The purpose of this section is to derive an equivalent linear relation for $P_i(n)$. The derivation sums over multiple patches at once, rather than proceeding strictly hierarchically as in the original derivation. 

We consider the contributions to $P_i(n)$ as coming from several ($2^{j-i}$) patches of area $A_j = A_i/2^{j-i}$ ("boxes") instead of from 2 patches of area $A_{i+1} = A_i/2$ as before (see figure 2). The probability of finding $n$ individuals in $A_i$ is then the sum over $r$ of the probability of finding $r$ of these "boxes" with the species present ($R_i^j(r)$), multiplied by the probability of finding $n$ individuals in these $r$ boxes ($Q_i^j(r, n)$): 

FIG. 1. Explanation of Equation (2). Consider the case $i = 4$ and $n = 3$. Circles correspond to individuals of a particular species found in a patch. On the left side there are three individuals in a patch $A_4$; on the right side are all the possible ways in which those 3 individuals can be distributed between the two patches $A_5$. The probability of finding three individuals in a patch $A_4$ is then the sum of two terms: the probability that the species is present on only one side (prob. $1 - a$ for each side) times the probability that there are three individuals on that side (prob. $P_5(3)$), plus the probability that the species is present on both sides (prob. $2a - 1$) times the probability that there are two individuals on one side and 1 individual on the other (prob. $P_5(2) P_5(1)$ for each of the two arrangements). Taking $x = 2(1 - a)$ and $1 - x = 2a - 1$ we find $P_4(3) = x P_5(3) + (1 - x)\, 2 P_5(2) P_5(1)$. This can be generalized to obtain Equation (2). Figure taken 
from 01. 



$$P_i(n) = \sum_r R_i^j(r)\, Q_i^j(r, n) \tag{3}$$



Note that the index $j$ is not summed over. It is arbitrary, indicating the size of the "box". For $j = i + 1$ there are two boxes of area $A_i/2$ and the original result of Harte et al. is recovered, whereas for $j = m - 1$ we will find a linear relation. But before establishing these results we calculate explicitly $R_i^j(r)$ and $Q_i^j(r, n)$: 

• $Q_i^j(r, n)$ is the probability of finding $n$ individuals in $r$ boxes of size $A_i/2^{j-i}$ in a total area $A_i$. 

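$$Q_i^j(r, n) = \begin{cases} \displaystyle \sum_{n_1, \ldots, n_r = 1}^{2^{m-j}} \left( \prod_{k=1}^{r} P_j(n_k) \right) \delta\!\left(n - \sum_{k=1}^{r} n_k\right) & r \le 2^{j-i} \\ 0 & r > 2^{j-i} \end{cases} \tag{4}$$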



This formula sums the probability of finding $n_1$ individuals in the first box, $n_2$ in the second one, etc., while the Kronecker delta restricts the sum to those configurations that add up to the total number of individuals $n$. $2^{j-i}$ is the maximum number of boxes and $2^{m-j}$ is the maximum number of individuals in each box. 

• $R_i^j(r)$ is the probability of finding $r$ boxes of size $A_j$ in which the species is present, in a total area $A_i$. This is just: 

$$R_i^j(r) = P_{m+i-j}(r) \tag{5}$$

This follows because the reasoning expressed in figure 1 can be applied to find the same recursion relation for $R_i^j(r)$ as for $P_i(n)$: 

FIG. 2. $A_m$ is the minimum patch. The box $A_j$ shown here comprises two minimum patches, but it can be of any size. In Equation (2) the contributions to $P_i(n)$ come from the two patches of size $A_{i+1}$, whereas in the case of the linear recursion relation they come from the $2^{j-i}$ patches of size $A_j$. 



$$R_i^j(r) = x R_{i+1}^j(r) + (1 - x) \sum_{k=1}^{r-1} R_{i+1}^j(k)\, R_{i+1}^j(r - k) \tag{6}$$



The initial conditions do not change either, with $R_j^j(r) = \delta_{r1}$. The only difference from the derivation for $P_i(n)$ is that $r$ refers to the number of boxes (not individuals) and that the recursion has to be applied $j - i$ times instead of $m - i$ times. 
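
The box decomposition of Equation (3) can be checked numerically for any intermediate level $j$. The sketch below is an illustration under stated assumptions: the helper names (`step`, `iterate`, `convolve`, `abundance_via_boxes`) are invented here, $Q_i^j(r, \cdot)$ is built as the $r$-fold convolution of $P_j$ (which is what Equation (4) expresses), and $R_i^j$ is obtained from Equation (5) by iterating the same recursion $j - i$ times.

```python
def step(p_next, x):
    """One level of the recursion in Eq. (2) (identical in form to Eq. (6))."""
    n_next = len(p_next) - 1
    p = [0.0] * (2 * n_next + 1)
    for n in range(1, 2 * n_next + 1):
        one = p_next[n] if n <= n_next else 0.0
        both = sum(p_next[k] * p_next[n - k]
                   for k in range(max(1, n - n_next), min(n - 1, n_next) + 1))
        p[n] = x * one + (1.0 - x) * both
    return p


def iterate(levels, x):
    """Apply the recursion `levels` times, starting from delta_{n,1}."""
    p = [0.0, 1.0]
    for _ in range(levels):
        p = step(p, x)
    return p


def convolve(u, v):
    """Discrete convolution of two lists indexed by count."""
    out = [0.0] * (len(u) + len(v) - 1)
    for s, us in enumerate(u):
        if us == 0.0:
            continue
        for t, vt in enumerate(v):
            out[s + t] += us * vt
    return out


def abundance_via_boxes(i, j, m, x):
    """Eq. (3): P_i(n) = sum_r R_i^j(r) Q_i^j(r, n) for a chosen box level j."""
    p_j = iterate(m - j, x)           # P_j(n): abundance inside one occupied box
    r_dist = iterate(j - i, x)        # R_i^j(r) = P_{m+i-j}(r), Eq. (5)
    total = [0.0] * (2 ** (m - i) + 1)
    q = [1.0]                         # zero boxes hold zero individuals
    for r in range(1, len(r_dist)):
        q = convolve(q, p_j)          # Q_i^j(r, .): r-fold convolution of P_j
        for n, qn in enumerate(q):
            total[n] += r_dist[r] * qn
    return total


if __name__ == "__main__":
    x = 2.0 * (1.0 - 2.0 ** -0.4)     # z = 0.4 chosen only as an example
    direct = iterate(4, x)            # P_1(n) for m = 5
    for j in range(1, 6):
        boxes = abundance_via_boxes(1, j, 5, x)
        print(j, max(abs(u - v) for u, v in zip(direct, boxes)))
```

The result should not depend on $j$ up to floating-point rounding, which is the content of the remark that $j$ is arbitrary.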

We can now check that for $j = i + 1$ we find the same result as before: 



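$$P_i(n) = \sum_r R_i^{i+1}(r)\, Q_i^{i+1}(r, n) = R_i^{i+1}(1)\, Q_i^{i+1}(1, n) + R_i^{i+1}(2)\, Q_i^{i+1}(2, n) \tag{7}$$

Reading off from Equation (4): 

$$Q_i^{i+1}(2, n) = \sum_{k=1}^{n-1} P_{i+1}(k)\, P_{i+1}(n - k) \tag{8}$$

$$Q_i^{i+1}(1, n) = P_{i+1}(n) \tag{9}$$

$$R_i^{i+1}(1) = x \tag{10}$$

$$R_i^{i+1}(2) = 1 - x \tag{11}$$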

Hence we obtain: 

$$P_i(n) = x P_{i+1}(n) + (1 - x) \sum_{k=1}^{n-1} P_{i+1}(k)\, P_{i+1}(n - k) \tag{12}$$



as announced previously. To obtain a linear relation we set $j = m - 1$ and obtain: 

$$Q_i^{m-1}(r, n) = \sum_{n_1, \ldots, n_r = 1}^{2} \left( \prod_{k=1}^{r} P_{m-1}(n_k) \right) \delta\!\left(n - \sum_{k=1}^{r} n_k\right) \tag{13}$$

For $P_{m-1}(n)$ we only have the following possibilities: 

$$P_{m-1}(n) = \begin{cases} x & n = 1 \\ 1 - x & n = 2 \\ 0 & n \ne 1, 2 \end{cases} \tag{14}$$



We find, denoting by $q = n - r$ the number of boxes with two individuals (factors of $P_{m-1}(2)$ in the equation above): 

$$g(n, r) = Q_i^{m-1}(r, n) = \frac{r!}{(r - q)!\, q!}\, x^{r-q} (1 - x)^q \tag{15}$$



The first factor is the number of possible configurations in which there are $q$ boxes with two individuals and $r - q$ with one individual. Finally we obtain: 



$$P_i(n) = \sum_r P_{i+1}(r)\, g(n, r) \tag{16}$$

which is a linear relation involving $P_i(n)$ and $P_{i+1}(n)$. 
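
As a hedged illustration, the linear relation of Equation (16) can be compared numerically with the nonlinear recursion of Equation (2). The names `g`, `linear_step`, `nonlinear_step` and `iterate` are assumptions introduced for this sketch; both recursions start from $P_m(n) = \delta_{n1}$ and should agree up to floating-point rounding.

```python
from math import comb


def g(n, r, x):
    """Eq. (15): probability that r occupied boxes hold n individuals in total."""
    q = n - r                          # number of boxes holding two individuals
    if q < 0 or q > r:
        return 0.0
    return comb(r, q) * x ** (r - q) * (1.0 - x) ** q


def linear_step(p_next, x):
    """Eq. (16): P_i(n) = sum_r P_{i+1}(r) g(n, r)."""
    r_max = len(p_next) - 1
    return [
        # g(n, r) vanishes unless n/2 <= r <= n, so only that range is summed.
        sum(p_next[r] * g(n, r, x)
            for r in range(max(1, (n + 1) // 2), min(n, r_max) + 1))
        for n in range(2 * r_max + 1)
    ]


def nonlinear_step(p_next, x):
    """Eq. (2), kept here only for comparison."""
    n_next = len(p_next) - 1
    p = [0.0] * (2 * n_next + 1)
    for n in range(1, 2 * n_next + 1):
        one = p_next[n] if n <= n_next else 0.0
        both = sum(p_next[k] * p_next[n - k]
                   for k in range(max(1, n - n_next), min(n - 1, n_next) + 1))
        p[n] = x * one + (1.0 - x) * both
    return p


def iterate(step, levels, x):
    """Apply `step` repeatedly, starting from P_m(n) = delta_{n,1}."""
    p = [0.0, 1.0]
    for _ in range(levels):
        p = step(p, x)
    return p


if __name__ == "__main__":
    x = 2.0 * (1.0 - 2.0 ** -0.5)     # z = 0.5 chosen only as an example
    lin = iterate(linear_step, 6, x)
    non = iterate(nonlinear_step, 6, x)
    print(max(abs(u - v) for u, v in zip(lin, non)))
```

One way to see why the two agree: with $h(s) = xs + (1 - x)s^2$, the generating function of $P_i$ is the $(m - i)$-fold composition of $h$; Equation (2) applies $h$ on the outside and Equation (16) applies it on the inside.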



IV. THE SCALING LAW 

Equation (13) allows us to derive the scaling law that was assumed by Banavar et al.: 

$$P_i(n) = \frac{1}{n} f\!\left(\frac{n}{N_i^{\phi}}\right) \tag{17}$$

where $N_i$ ($= 2^{m-i}$) is the maximum number of individuals in an area $A_i$ and $\phi = 1 - z$. 

This is achieved in two steps: 

• First, find the continuum limit for $g(n, r)$. Since $g(n, r)$ is just a binomial distribution, it tends to a Gaussian for large $n$: 

$$g(n, r) = \frac{r!}{(r - q)!\, q!}\, x^{r-q} (1 - x)^q \approx \frac{1}{\sqrt{2\pi r}} \frac{1}{\sqrt{x(1 - x)}} \exp\left[-\frac{1}{2} \frac{(q - r(1 - x))^2}{r x (1 - x)}\right] \tag{18}$$

$$= \frac{1}{2a} \frac{1}{\sqrt{2\pi}\, \sqrt{2(2a - 1)(1 - a)\, r/(2a)^2}} \exp\left[-\frac{1}{2} \frac{(r - n/2a)^2}{2(2a - 1)(1 - a)\, r/(2a)^2}\right] \tag{19}$$

$g(n, r)$ is the probability of finding $n$ individuals in $r$ boxes. This probability is highly peaked around $n = 2ar$, since $2a$ ($= 1(1 - a) + 1(1 - a) + 2(2a - 1)$) is the average number of individuals per box. The more boxes there are (bigger $r$), the sharper the peak. This means that for large $r$ the only relevant values of $n$ are those near $n = 2ar$, and the expression given above for $g(n, r)$ is valid for large $r$ (which implies large $n$). 

• Second, rewrite everything in terms of a new variable $x$ and a new probability density $P_i(x)$. Here $x$ replaces $n$ and is the fraction of the maximum number of individuals, $n/N_i$ (which varies from 0 to 1); it should not be confused with the parameter $x = 2(1 - a)$ used above. The probability density is $P_i(x) = P_i(n)/(1/2^{m-i})$, where $1/2^{m-i}$ is the distance between two points in the new variable $x$. In this way all the $P_i(n)$ can be compared with each other on equal terms. 
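
The Gaussian approximation used in the first step can be compared with the exact binomial form. The following is a minimal sketch, assuming $1/2 < a < 1$ so that both logarithms are finite; the function names `g_exact` and `g_gauss` are made up for this illustration, and the exact form is evaluated through `math.lgamma` to avoid overflow of the binomial coefficient at large $r$.

```python
from math import exp, lgamma, log, pi, sqrt


def g_exact(n, r, a):
    """Binomial form of g(n, r), evaluated in log space (assumes 1/2 < a < 1)."""
    x = 2.0 * (1.0 - a)
    q = n - r                          # boxes with two individuals
    if q < 0 or q > r:
        return 0.0
    log_binom = lgamma(r + 1) - lgamma(q + 1) - lgamma(r - q + 1)
    return exp(log_binom + (r - q) * log(x) + q * log(1.0 - x))


def g_gauss(n, r, a):
    """Gaussian form of g(n, r) written in terms of r - n/(2a)."""
    var = 2.0 * (2.0 * a - 1.0) * (1.0 - a) * r / (2.0 * a) ** 2
    return exp(-0.5 * (r - n / (2.0 * a)) ** 2 / var) / (2.0 * a * sqrt(2.0 * pi * var))


if __name__ == "__main__":
    a = 2.0 ** -0.5                    # z = 0.5 chosen only as an example
    r = 400
    n_peak = round(2.0 * a * r)        # the distribution peaks near n = 2ar
    for n in (n_peak - 20, n_peak, n_peak + 20):
        print(n, g_exact(n, r, a), g_gauss(n, r, a))
```

Near the peak the two forms should agree closely for a few hundred boxes, while in the far tails the relative error of the Gaussian form grows, in line with the remark that only values near $n = 2ar$ matter.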

In terms of these new variables, the recursion relation can now be written as: 

$$P_i(x) = 2 \sum_{y = 1/2^{m-i-1}}^{1} g(2^{m-i} x,\, 2^{m-i-1} y)\, P_{i+1}(y) \tag{20}$$

The continuum limit is found by taking $m$ (and consequently the number of points $N_{i+1} = 2^{m-i-1}$) to an arbitrarily large value and using the continuum limit of $g(n, r)$ as defined above. The fact that the approximation for $g(n, r)$ is not a very good one for small values of $n$ or $r$ is of little importance in the limit of large $m$: 

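$$P_i(x) = 2 \cdot 2^{m-i-1} \sum_{y} g(2^{m-i} x,\, 2^{m-i-1} y)\, P_{i+1}(y)\, \underbrace{\frac{1}{2^{m-i-1}}}_{\Delta y} \;\longrightarrow\; \int g^*(x, y)\, P_{i+1}(y)\, dy \tag{21}$$

where $g^*(x, y) = 2 \cdot 2^{m-i-1}\, g(2^{m-i} x,\, 2^{m-i-1} y)$ and is equal to $\frac{1}{a}\delta(y - x/a)$ in the limit of large $m$: 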



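$$g^*(x, y) \approx \frac{1}{a} \frac{1}{\sqrt{2\pi}\, \sigma_m} \exp\left[-\frac{(y - x/a)^2}{2 \sigma_m^2}\right], \qquad \sigma_m^2 = \frac{2(2a - 1)(1 - a)\, y}{(2a)^2\, 2^{m-i-1}} \tag{22}$$

This follows from substituting $n = 2^{m-i} x$ and $r = 2^{m-i-1} y$ into Equation (19). It is a standard representation of the Dirac delta function [6] in the $x/a$ variable for $\sigma_m \to 0$ (that is, $m \to \infty$): 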



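$$\lim_{m \to \infty} \int g^*(x, y)\, f(x)\, dx = f(ay) \;\Longrightarrow\; \lim_{m \to \infty} g^*(x, y) = \frac{1}{a}\, \delta(y - x/a) \tag{23}$$

This implies that 

$$P_i(x) = \frac{1}{a}\, P_{i+1}(x/a) \tag{24}$$

which is, in terms of $n$ and $P_i(n)$, 

$$P_i(n) = \frac{1}{2a}\, P_{i+1}(n/2a) \tag{25}$$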



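Since $a = 2^{-z}$ and $\phi = 1 - z$, we have $2a = 2^{\phi}$. Multiplying the above equation by $n$ and writing the explicit dependence of $P_i(n)$ on $N_i$ as $P_i(n) = P(n, N_i)$: 

$$n P(n, N_i) = \frac{n}{2a}\, P(n/2a, N_{i+1}) = \frac{n}{2^{\phi}}\, P(n/2^{\phi}, N_i/2) \tag{26}$$

$$\Longrightarrow\; f(n, N_i) \equiv n P(n, N_i) = \frac{n}{2^{\phi}}\, P(n/2^{\phi}, N_i/2)$$

which is by definition $f(n/2^{\phi}, N_i/2)$. Since $N_i$ is equal to a power of two, this means that $n P_i(n)$ is a function only of $n/N_i^{\phi}$: 

$$P_i(n) = \frac{1}{n} f\!\left(\frac{n}{N_i^{\phi}}\right) \tag{27}$$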

FIG. 3. Scaling function $n P_0(n) = f(n/N^{\phi})$ for $m = 8$ and $m = 9$, and $z = 0.4$ and $z = 0.5$. $n^* = n/2^{\phi} = n/2a$. 

As can be appreciated from the results above, a constant $a$ (not dependent on $i$) is necessary to obtain the scaling law: otherwise $\phi$ would depend on $i$. In figure 3 we exhibit the scaling function for several $z$ and demonstrate the scaling law. 
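
A rough numerical illustration of the scaling variable can be produced with the linear recursion. This is a sketch under assumptions: the function names (`linear_step`, `abundance`, `scaling_curve`, `mean_abundance`) are hypothetical, and the values of $m$ and $z$ simply mirror those quoted in the figure caption. The code tabulates the points $(n/N^{\phi},\, n P_0(n))$ and also checks that the mean abundance equals $N^{\phi} = (2a)^m$, which holds because each application of Equation (16) multiplies the mean by $2a$.

```python
from math import comb


def linear_step(p_next, x):
    """Eq. (16) with g(n, r) = C(r, n - r) x^(2r - n) (1 - x)^(n - r)."""
    r_max = len(p_next) - 1
    p = [0.0] * (2 * r_max + 1)
    for n in range(1, 2 * r_max + 1):
        for r in range(max(1, (n + 1) // 2), min(n, r_max) + 1):
            q = n - r                  # boxes with two individuals
            p[n] += p_next[r] * comb(r, q) * x ** (r - q) * (1.0 - x) ** q
    return p


def abundance(m, z):
    """P_0(n) for a minimum patch at level m and SAR exponent z."""
    x = 2.0 * (1.0 - 2.0 ** (-z))
    p = [0.0, 1.0]                     # P_m(n) = delta_{n,1}
    for _ in range(m):
        p = linear_step(p, x)
    return p


def scaling_curve(m, z):
    """Points (n / N^phi, n P_0(n)) with N = 2^m and phi = 1 - z."""
    scale = (2.0 ** m) ** (1.0 - z)
    p = abundance(m, z)
    return [(n / scale, n * p[n]) for n in range(1, len(p))]


def mean_abundance(p):
    return sum(n * pn for n, pn in enumerate(p))


if __name__ == "__main__":
    for z in (0.4, 0.5):
        for m in (8, 9):
            curve = scaling_curve(m, z)
            peak = max(curve, key=lambda point: point[1])
            print(f"z={z} m={m} peak at n/N^phi={peak[0]:.3f}, height={peak[1]:.3f}")
```

Plotting the curves for $m = 8$ and $m = 9$ at fixed $z$ on the same axes is a simple way to judge by eye how closely they overlap at these moderate values of $m$.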